Editorial

Data-driven systems biology approaches
Luonan Chen
Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China. Correspondence to: Luonan Chen, E-mail: lnchen@sibs.ac.cn
J Mol Cell Biol, Volume 9, Issue 6, December 2017, Page 435, https://doi.org/10.1093/jmcb/mjy004

The rapid accumulation of biological data is shifting system-level studies from describing complex phenomena to understanding molecular mechanisms, and from analyzing individual components to understanding their networks and systems (Chen et al., 2009; Chen and Wu, 2015). Data-driven systems biology approaches are emerging as essential tools for gaining new insights into biological processes and systems. In this issue, we collect several research articles on such data-driven methodologies and their applications, ranging from new computational tools (GWAS and signaling pathway studies) to molecular biology (CSRE inference) and disease analyses (detection of the disease tipping point by DNBs and identification of key genes during glioma progression).

Traditional GWAS methods based on genomic data usually ignore tissue-specific gene networks and thus cannot exploit informative tissue-specific characteristics. To overcome this drawback, Drs Jiang and Wong’s groups proposed a Bayesian approach called SIGNET, which integrates GWAS data with multiple tissue-specific gene networks to infer phenotype-associated genes and relevant tissues. Their studies clearly demonstrated the power of SIGNET both in deciphering the genetic basis of a phenotype and in yielding biological insights into it.

Drs Zhao and Li developed HISP, another data-driven method presented in this issue, to determine both the topologies and the directions of signaling pathways based on integer linear programming and a genetic algorithm. Using gene expression and knockout data, HISP can accurately determine the optimal topologies of signaling pathways. Benchmark results on yeast MAPK signaling pathways demonstrated its efficiency. In particular, HISP unveiled a high-resolution EGFR/ErbB signaling pathway in human hepatocytes, including many signaling interactions that were missed by existing computational approaches.

Decoding epigenomes (e.g. histone modifications) into functional regulatory elements is a challenging task in computational systems biology. Drs Wang and Zhang adopted a data-driven framework to comprehensively characterize CSREs and their histone modification codes in human epigenomes, using five histone modifications across 127 tissues or cell types. Moreover, clustering CSREs by their specificity signals revealed distinct histone codes, demonstrating the diversity of functional roles of CSREs within the same cell or tissue.

Data-driven approaches are also widely applied to comprehensive studies of complex diseases at a system level. Glioma is a complex disease with limited treatment options. Despite certain advances in glioma research, studies of the underlying molecular events (mutations) have lagged considerably. By analyzing currently available databases, Zhang and colleagues identified EZH2, KMT2C, and CHD4 as important genes in glioma, in addition to the known genes IDH1/2. They showed that genomic alterations of PIK3CA, CDKN2A, CDK4, FIP1L1, or FUBP1 collaborate with IDH mutations to negatively affect patients’ survival in lower-grade gliomas, providing new insights into the molecular mechanisms of disease progression.

Identifying the tipping point of a complex disease before disease onset is of great importance for disease prevention and treatment. Indeed, hunting for the cancer tipping point is considered the next big problem in biology and medicine. To detect the tipping points of diseases, a new type of biomarker and data-driven method, the DNB method, was developed and applied to uncover the critical transition from chronic inflammation to hepatocellular carcinoma (HCC). The DNB method has a solid foundation in nonlinear dynamical theory, and was shown both theoretically and numerically to serve as a general early-warning signal indicating the tipping point just before disease deterioration (Zhang et al., 2015). Based on time-series proteomic data from woodchuck hepatitis virus/c-myc mice and age-matched wt-C57BL/6 mice, the study by Drs Zeng, Chen, and Wu’s groups demonstrated that dysfunction of PLA2G6 and CYP2C44, as core DNB members, signals imminent carcinogenesis from chronic inflammation to HCC; this finding was also validated by further experiments.
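
To give a rough sense of how an early-warning signal of this kind can be computed from expression data, the sketch below uses a composite index that often appears in the DNB literature: the average standard deviation of a candidate gene group, multiplied by the average absolute correlation within the group, and divided by the average absolute correlation between the group and all other genes. This editorial does not specify the exact formulation used in the featured study, so the index, the function name `dnb_score`, the `eps` guard, and the synthetic data are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def dnb_score(expr, members, eps=1e-12):
    """Composite early-warning index for one time point (illustrative only).

    expr    : array of shape (n_samples, n_genes), expression at one time point
    members : indices of the candidate DNB genes (at least two)
    eps     : small constant guarding against division by zero (assumption)
    """
    expr = np.asarray(expr, dtype=float)
    members = np.asarray(members)
    others = np.setdiff1d(np.arange(expr.shape[1]), members)

    # Average sample standard deviation of the candidate group.
    sd_in = expr[:, members].std(axis=0, ddof=1).mean()

    # Absolute Pearson correlations between all pairs of genes.
    corr = np.abs(np.corrcoef(expr, rowvar=False))

    # Average correlation within the group (upper triangle, no diagonal).
    inner = corr[np.ix_(members, members)]
    pcc_in = inner[np.triu_indices(len(members), k=1)].mean()

    # Average correlation between the group and the remaining genes.
    pcc_out = corr[np.ix_(members, others)].mean()

    return sd_in * pcc_in / (pcc_out + eps)


# Synthetic demonstration: a "normal" state versus a "pre-transition" state in
# which three genes fluctuate strongly and collectively.
rng = np.random.default_rng(0)
group = [0, 1, 2]

normal = rng.normal(0.0, 1.0, size=(50, 10))

critical = rng.normal(0.0, 1.0, size=(50, 10))
shared = rng.normal(0.0, 3.0, size=(50, 1))  # common fluctuation of the group
critical[:, group] += shared

score_normal = dnb_score(normal, group)
score_critical = dnb_score(critical, group)
print(score_normal, score_critical)
```

In this toy setting the index is markedly higher for the pre-transition samples, because the candidate group becomes both more variable and more tightly correlated. In practice one would compute such an index at each time point of a time series and look for a sharp rise, and the choice of candidate group is itself part of the analysis.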